ADAPTIVE GOODNESS-OF-FIT TESTS BASED ON
SIGNED RANKS

By Angelika Rohde 

Weierstraß-Institut Berlin

Within the nonparametric regression model with unknown regression
function $l$ and independent, symmetric errors, a new multiscale
signed rank statistic is introduced and a conditional multiple test of
the simple hypothesis $l = 0$ against a nonparametric alternative is
proposed. This test is distribution-free and exact for finite samples
even in the heteroscedastic case. It adapts in a certain sense to the
unknown smoothness of the regression function under the alternative,
and it is uniformly consistent against alternatives whose sup-norm
tends to zero at the fastest possible rate. The test is shown to
be asymptotically optimal in two senses: It is rate-optimal adaptive
against Hölder classes. Furthermore, its relative asymptotic efficiency
with respect to an asymptotically minimax optimal test under sup-norm
loss is close to 1 in case of homoscedastic Gaussian errors within
a broad range of Hölder classes simultaneously.

1. Introduction. Consider the nonparametric regression model with $n$
independent observations

$$Y_i = l(X_i) + \varepsilon_i, \qquad i = 1,\dots,n,$$

some unknown regression function $l$ on the unit interval and design points
$0 \le X_1 < \cdots < X_n \le 1$. Throughout this paper, the errors are assumed
to be independent and symmetrically distributed around zero, which in particular
includes the heteroscedastic case. We postulate Lebesgue continuous error
distributions in addition for the sake of simplicity. Within this model, we
are interested in identifying subintervals in the design space where $l$ deviates
significantly from some hypothetical regression curve $l_0$. For this aim, we
develop an exact multiple test of the simple hypothesis "$l = l_0$" against a
nonparametric alternative. The method does not require a priori knowledge
of the explicit error distributions, and it provides simultaneous confidence
statements about deviations of $l$ from $l_0$ with given significance level for
arbitrary finite sample size.

For the power investigation of our test, we follow the minimax approach
introduced by Ingster (1982, 1993), which permits the set of alternatives
to consist of an entire smoothness class, separated from the null hypothesis
by some distance $\delta_n$ converging to zero. Typically, the distance to the
null hypothesis is quantified by some seminorm $\|\cdot\|$. Then for a given
significance level $\alpha$ and some positive number $\delta$ the goal is to find
a statistical test $\phi$ whose minimal power

$$\inf_{l \in \mathcal{F}:\ \|l - l_0\| \ge \delta} E_l \phi$$

is as large as possible under the constraint that $E_{l_0}\phi \le \alpha$.
Approximate solutions for this testing problem are known for various classes
$\mathcal{F}$ and seminorms $\|\cdot\|$; see, for instance, Ingster (1987, 1993)
for the case of $L_p$-norm and Hölder and Sobolev alternatives, Ermakov (1990)
for sharp asymptotic results with respect to the $L_2$-norm and Sobolev
alternatives and Lepski (1993) and Lepski and Tsybakov (2000) in case of the
supremum norm. It is a general problem that the optimal test $\phi$ may depend
on $\mathcal{F}$.

In case of an integral norm $\|\cdot\|$, the problem of adaptive (data-driven)
testing of a simple or parametric hypothesis is investigated, for example, in
Eubank and Hart (1992), Ledwina (1994), Ledwina and Kallenberg (1995),
Fan (1996), Fan, Zhang and Zhang (2001), Spokoiny (1996, 1998), Hart
(1997) and Horowitz and Spokoiny (2001, 2002). The general procedure is to
consider simultaneously a family of test statistics corresponding to different
values of smoothing parameters, respectively. As Spokoiny (1996) pointed
out, the adaptive approach in case of the $L_2$-norm leads necessarily to
suboptimal rates by a factor $\log\log n$. In particular, the tests in Fan (1996)
and Spokoiny (1996) are based on the maximum of centered and standardized
statistics and (up to this constraint) rate-optimal adaptive against a smooth
alternative; see also Fan and Huang (2001). For our purpose, the supremum
norm seems to be the most adequate distance. Within the continuous-time
Gaussian white noise model, Dümbgen and Spokoiny (2001) have shown that
in contrast to the $L_2$-case, adaptive testing with respect to sup-norm loss
is actually possible without essential loss of efficiency. They propose a test
based on the supremum of suitably standardized kernel estimators of the
regression function over different locations and over different bandwidths in
order to achieve adaptivity. Unfortunately, their testing procedure depends
explicitly on homoscedasticity and Gaussian errors or errors with at least
sub-Gaussian tails. If these assumptions are violated, the test may lose its
exact or even asymptotic validity. Moreover, its asymptotic power can be
arbitrarily small.
In the following section, a new multiscale signed rank statistic is introduced
and a conditional test of a one-point hypothesis against a nonparametric
alternative is developed. In the third section, its asymptotic power is
studied in the setting of homoscedastic errors. A lower bound for minimax
testing with respect to sup-norm loss is provided, which is explicitly given in
terms of Fisher information. The test turns out to be rate-optimal against
arbitrary Hölder classes, provided that the Fisher information of the error
distribution is finite. Moreover, a lower bound for its relative asymptotic
efficiency with respect to an asymptotically minimax optimal test under
sup-norm loss is determined, and the classical efficiency bound $3/\pi$ is
recovered even over a broad range of Hölder classes simultaneously. A numerical
example illustrating our method is presented in Section 4. Possible extensions
are briefly discussed in Section 5. All proofs are deferred to Section 6.

For asymptotic investigations, the design variables are supposed to be
deterministic and sufficiently regular in the sense of the assumption

(D) There exists a strictly positive and continuous Lebesgue probability
density $h$ on $[0,1]$ of finite total variation such that $X_i = H^{-1}(i/n)$,
with $H$ the distribution function of $h$.

By subtraction of $l_0$ from the observations, we may assume without loss
of generality that $l_0 = 0$. Depending on the design density $h$, it is then
assumed that under the alternative the regression function $l$ belongs to some
smoothness class

$$\mathcal{H}_h(\beta, L) := \{\, l/\sqrt{h} \mid l \in \mathcal{H}(\beta, L; [0,1]) \,\},$$

where for any interval $I \subset \mathbb{R}$, $\mathcal{H}(\beta, L; I)$ denotes the
class of Hölder functions on $I$ with parameters $\beta, L > 0$. In case
$0 < \beta \le 1$,

$$\mathcal{H}(\beta, L; I) := \{\, f : I \to \mathbb{R} \mid |f(x) - f(y)| \le L|x - y|^{\beta} \text{ for all } x, y \in I \,\}.$$

If $k < \beta \le k + 1$ for an integer $k \ge 1$, let $\mathcal{H}(\beta, L; I)$ be
the set of functions on $I$ that are $k$ times differentiable and whose $k$th
derivative belongs to $\mathcal{H}(\beta - k, L; I)$. We also write
$\mathcal{H}(\beta, L)$ for $\mathcal{H}(\beta, L; [0,1])$. In particular,
$\mathcal{H}_h(\beta, L)$ coincides with $\mathcal{H}(\beta, L)$ for $h(\cdot) = 1$,
corresponding to equidistant design points $X_i = i/n$, $i = 1, \dots, n$.

2. The multiscale signed rank statistic. Inspired by the high asymptotic
efficiency of Wilcoxon's signed rank test in simple location shift families [see
Hájek and Šidák (1967)], the idea is to define a multiscale testing procedure
combining suitably standardized local signed rank statistics. The construction
is related to the work of Dümbgen (2002), who used local rank statistics
for a test of stochastic monotonicity. In the present context it will turn out
that the highest asymptotic efficiency is achieved by weighted local signed
rank statistics.

For some kernel function $\psi$ on $[0,1]$ to be specified later and any pair
$(s,t)$ with $0 \le s < t \le 1$, let $\psi_{st}$ be the shifted and rescaled kernel
on the interval $[s,t]$, pointwise given by

$$\psi_{st}(x) := \psi\Bigl(\frac{x - s}{t - s}\Bigr).$$

For notational convenience, we simply write $\psi_{jk}$ for $\psi_{X_j X_k}$,
$X_j < X_k$. For any $1 \le j < k \le n$ let $R_{jk} := (R_{jk}(i))_{i=j}^{k}$, with
$R_{jk}(i)$ the rank of $|Y_i|$ among the $k - j + 1$ numbers $|Y_l|$,
$l = j, \dots, k$. Define the local test statistic

(1) $$T_{jk} := \frac{\sum_{i=j}^{k} \psi_{jk}(X_i)\,\mathrm{sign}(Y_i)\,R_{jk}(i)}{\sqrt{\sum_{i=j}^{k} \psi_{jk}(X_i)^2\,R_{jk}(i)^2}}$$

if the denominator is not equal to zero; and set $T_{jk}$ equal to zero otherwise.
The law of $T_{jk}$ depends heavily on the unknown error distributions, but
under the null hypothesis, the conditional distribution
$\mathcal{L}(T_{jk} \mid R_{jk})$ does not, even in case of heteroscedastic errors.
Hence distribution-freeness may be achieved via conditioning on the ranks. Note
that the denominator in (1) is the conditional standard deviation of the
numerator given $R_{jk}$ under the null hypothesis.
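
As an illustration of the local statistic in (1), the following minimal Python sketch computes the kernel-weighted sum of signs times local ranks and divides it by its conditional standard deviation. The function names `local_ranks` and `local_signed_rank_stat`, the 0-based indexing and the assumption of no ties among the absolute values are choices made for this sketch only; they are not part of the paper.

```python
import math


def local_ranks(abs_vals):
    """Ranks (1 = smallest) of the entries of abs_vals; ties are assumed absent."""
    order = sorted(range(len(abs_vals)), key=lambda i: abs_vals[i])
    ranks = [0] * len(abs_vals)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def local_signed_rank_stat(x, y, j, k, kernel):
    """Local statistic T_jk on the design points x[j..k] (0-based, inclusive).

    kernel is a function on [0, 1]; it is evaluated at (x_i - x_j) / (x_k - x_j).
    Returns 0.0 if the denominator vanishes, as in the definition of T_jk.
    """
    s, t = x[j], x[k]
    idx = range(j, k + 1)
    ranks = local_ranks([abs(y[i]) for i in idx])
    num = 0.0
    den_sq = 0.0
    for r, i in zip(ranks, idx):
        w = kernel((x[i] - s) / (t - s))
        sign = (y[i] > 0) - (y[i] < 0)
        num += w * sign * r          # weighted sign times local rank
        den_sq += (w * r) ** 2       # conditional variance of the numerator
    return num / math.sqrt(den_sq) if den_sq > 0 else 0.0
```

With the rectangular kernel and observations (0.5, -0.2, 1.0), the local ranks are (2, 1, 3), so the statistic equals (2 - 1 + 3)/sqrt(14).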

The question is how to combine these single test statistics in an adequate 
way. The following theorem acts as a motivation for our approach. 

Theorem 1. Let the test statistic $T_n$ be defined by

$$T_n := \max_{1 \le j < k \le n} \Bigl\{ |T_{jk}| - \sqrt{2\log(n/(k - j))} \Bigr\},$$

based on a continuous kernel $\psi : [0,1] \to \mathbb{R}$ of bounded total variation
with $\int \psi(x)\,dx > 0$. Let assumption (D) be satisfied. Then in case of
independent identically distributed errors,

$$\mathcal{L}_0(T_n \mid R_{1n}) \to_{w,P_0} \mathcal{L}(T_0),$$

where

$$T_0 := \sup_{0 \le s < t \le 1} \Bigl\{ \frac{\bigl|\int \psi_{st}(x)\sqrt{h(x)}\,dW(x)\bigr|}{\|\psi_{st}\sqrt{h}\|_2} - \sqrt{2\log\bigl(1/(H(t) - H(s))\bigr)} \Bigr\}$$

with $W$ a Brownian motion on the unit interval.

Here, $\to_{w,P_0}$ refers to weak convergence in probability. It follows from
results in Dümbgen and Spokoiny (2001) that $T_0$ is finite almost surely. The
additive correction in the limiting statistic appears as a suitable calibration
for taking the supremum. For it is well known that the maximum of $n$
independent $\mathcal{N}(0,1)$-distributed random variables equals
$(2\log n)^{1/2} + o_p(1)$ as $n \to \infty$.

For the testing problem as described in this section, we propose the
conditional test

$$\phi_\alpha(Y) := \begin{cases} 0 & \text{if } T_n \le \kappa_\alpha(R), \\ 1 & \text{if } T_n > \kappa_\alpha(R), \end{cases}$$

where $\kappa_\alpha(R) := \arg\min_{C > 0}\{P(T_n \le C \mid R) \ge 1 - \alpha\}$ denotes
the generalized $(1 - \alpha)$-quantile of the conditional distribution of $T_n$
given $R$ under the null hypothesis. For explicit applications, we determine
$\kappa_\alpha(R)$ via Monte Carlo simulations which are easy to implement. This
test is distribution-free and keeps the significance level for arbitrary finite
sample size also in the heteroscedastic case. Since the test statistic is
discrete-valued, exact level $\alpha$ is attained only for certain values
$\alpha \in (0,1)$. In order to achieve arbitrary significance levels exactly, the
test can be canonically extended to a randomized procedure.
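
The Monte Carlo step can be sketched as follows, under the null-hypothesis fact that the signs are i.i.d. Rademacher variables independent of the ranks: the ranks of the observed sample are kept fixed and only the signs are redrawn. This is a minimal sketch, not the author's implementation; the names `rank_weights`, `multiscale_stat` and `conditional_test`, the default of 199 simulations and the use of the plain empirical quantile are assumptions of the sketch, and continuous errors (no ties, no zeros) are assumed.

```python
import math
import random


def rank_weights(x, abs_y, kernel):
    """For every pair j < k: indices, kernel-weighted local ranks, denominator, penalty."""
    n = len(x)
    out = []
    for j in range(n - 1):
        for k in range(j + 1, n):
            idx = list(range(j, k + 1))
            order = sorted(idx, key=lambda i: abs_y[i])
            rank = {i: r for r, i in enumerate(order, start=1)}
            w = [kernel((x[i] - x[j]) / (x[k] - x[j])) * rank[i] for i in idx]
            den = math.sqrt(sum(v * v for v in w))
            pen = math.sqrt(2.0 * math.log(n / (k - j)))  # additive correction
            out.append((idx, w, den, pen))
    return out


def multiscale_stat(signs, weights):
    """max over j < k of |T_jk| - sqrt(2 log(n / (k - j))) for a given sign vector."""
    best = -math.inf
    for idx, w, den, pen in weights:
        t = abs(sum(v * signs[i] for v, i in zip(w, idx))) / den if den > 0 else 0.0
        best = max(best, t - pen)
    return best


def conditional_test(x, y, kernel, alpha=0.1, n_sim=199, seed=0):
    """Return (observed T_n, Monte Carlo estimate of kappa_alpha(R), rejection flag)."""
    rng = random.Random(seed)
    weights = rank_weights(x, [abs(v) for v in y], kernel)  # depends on ranks only
    signs = [1 if v > 0 else -1 for v in y]
    t_obs = multiscale_stat(signs, weights)
    sims = sorted(
        multiscale_stat([rng.choice((-1, 1)) for _ in y], weights)
        for _ in range(n_sim)
    )
    # empirical generalized (1 - alpha)-quantile of the simulated statistics
    kappa = sims[min(n_sim - 1, math.ceil((1 - alpha) * n_sim) - 1)]
    return t_obs, kappa, t_obs > kappa
```

The precomputation costs on the order of n^3 operations and is meant for small samples only; it keeps the simulation loop simple because each draw reuses the same rank-dependent weights.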



Remark (Simultaneous detection of subregions with significant deviation
from zero). The conditional multiscale test may be viewed as a multiple
testing procedure. For a given vector of ranks, the corresponding test
statistic $T_n$ exceeds the $(1 - \alpha)$-significance level if, and only if,
the random family

$$\mathcal{D}_\alpha := \bigl\{ (X_j, X_k) \mid 1 \le j < k \le n;\ |T_{jk}| > \sqrt{2\log(n/(k - j))} + \kappa_\alpha(R) \bigr\}$$

is nonempty. Hence one may conclude that with confidence $1 - \alpha$, the
unknown regression function deviates from zero on every interval $(X_j, X_k)$ of
$\mathcal{D}_\alpha$.
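
The family of significant intervals, and the minimal intervals shown later in the numerical example, can be extracted as in the following sketch. It assumes that the local statistics are already available in a dictionary keyed by 1-based index pairs (j, k); the names `significant_intervals` and `minimal_intervals` are hypothetical, and the absolute value of the local statistic is used because that is what the equivalence with "T_n exceeds the critical value" requires.

```python
import math


def significant_intervals(stats, n, kappa):
    """Index pairs (j, k) whose local statistic exceeds its scale-dependent threshold.

    stats maps (j, k) with 1 <= j < k <= n to the local statistic T_jk.
    """
    return sorted(
        (j, k)
        for (j, k), t in stats.items()
        if abs(t) > math.sqrt(2.0 * math.log(n / (k - j))) + kappa
    )


def minimal_intervals(pairs):
    """Keep only those intervals that contain no other interval of the family."""
    return [
        (j, k)
        for (j, k) in pairs
        if not any((a, b) != (j, k) and j <= a and b <= k for (a, b) in pairs)
    ]
```

For example, with n = 10, critical value 1.0 and local statistics 3.0 on (1, 10), 4.0 on (2, 5), 1.0 on (3, 4) and -3.5 on (2, 9), the significant pairs are (1, 10), (2, 5) and (2, 9), and the only minimal interval is (2, 5).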

Remark (The choice of the kernel function $\psi$). If the design density
is equal to 1, the limit $T_0$ under the null hypothesis as given in Theorem 1
appears as combination of standardized kernel estimators for the
regression function in the standard Gaussian white noise model
$dY(t) = l(t)\,dt + n^{-1/2}\,dW(t)$, $0 \le t \le 1$. With a certain choice of the
kernel $\psi$ depending on the class of alternatives, it coincides there with an
asymptotically minimax optimal test statistic with respect to the supremum norm
of the testing problem "$l = 0$" against Hölder alternatives [Dümbgen and
Spokoiny (2001)]. This indicates that in the homoscedastic situation, our
conditional test may achieve the highest asymptotic efficiency with the same
choice of the kernel function. Here, the construction is as follows: For some
Hölder alternative $\mathcal{H}(\beta, L)$, let $\gamma_\beta$ be the solution to the
following minimization problem:

(2) Minimize $\|\gamma\|_2$ over all $\gamma \in \mathcal{H}(\beta, 1; \mathbb{R})$ with $\gamma(0) \ge 1$.
It is known that $\gamma_\beta$ is an even function with compact support, say
$[-R, R]$, and $\gamma_\beta(0) = 1 > |\gamma_\beta(x)|$ for $x \ne 0$. To be consistent
with the notation introduced above, the optimal kernel $\psi_\beta$ on $[0,1]$ is
then pointwise defined by $\psi_\beta(x) = \gamma_\beta(2Rx - R)$. It is worth noting
that the solution $\gamma_\beta$ only depends on the first parameter $\beta$, which
shows that the procedure is automatically adaptive with respect to the second
parameter $L$. In case $0 < \beta \le 1$, the solution of (2) is given by
$\gamma_\beta(x) = I\{|x| \le 1\}(1 - |x|^{\beta})$. For $\beta > 1$ an explicit solution
is known only for $\beta = 2$ [Leonov (1999)]. For details on how this function
can be constructed numerically, see Donoho (1994) and Leonov (1999).

3. Asymptotic power and adaptivity. In this section, the asymptotic
power of our test is investigated in case of independent identically distributed
errors. The asymptotic power of the above defined conditional test surely
depends on the unknown error distribution as well as the design regularity.
The subsequent Theorem 2 provides an extension of Lepski and Tsybakov's
(2000) lower bound for the nonparametric regression setting with Gaussian
errors to general symmetric error distributions with finite Fisher information.
Additionally, the result includes the case of non-equidistant design points.

Let $f$ denote the Lebesgue density of the error distribution. In order to
formulate the result on the asymptotic lower bound, let us introduce the
following assumptions:

(E1) $f$ is strictly positive and absolutely continuous on $\mathbb{R}$ with finite
Fisher information

$$I(f) := \int \frac{f'(z)^2}{f(z)}\,dz < \infty.$$

The required positivity of the error density $f$ in (E1) just ensures that for
any $\theta \in \mathbb{R}$ the shifted distribution
$\mathcal{L}_\theta(Y_i) = \mathcal{L}(\varepsilon_i + \theta)$ is absolutely continuous
with respect to $\mathcal{L}_0(Y_i) = \mathcal{L}(\varepsilon_i)$. Since we are dealing
with noncontiguous alternatives, we are in need of a slightly stronger assumption
than differentiability in quadratic mean, which would be equivalent to (E1).

(E2) There exists some positive constant $\delta_0$ such that we have the expansion

$$\int \Bigl\{ \Bigl(\frac{f(z + \theta)}{f(z)}\Bigr)^{1+\delta} - 1 \Bigr\} f(z)\,dz = \tfrac{1}{2}\,\delta(1 + \delta)\,\theta^2\,I(f)\,\bigl(1 + r(\theta, \delta)\bigr)$$

with a sequence $r(\theta, \delta) = O(1/\log(1/|\theta|))$ for $|\theta| \to 0$,
uniformly in $\delta \in (0, \delta_0]$.
Examples. (i) (Normal distribution). If $f$ denotes the Lebesgue density
of the $\mathcal{N}(0, \sigma^2)$-distribution, then $I(f) = \sigma^{-2}$ and

$$\int \Bigl\{ \Bigl(\frac{f(z + \theta)}{f(z)}\Bigr)^{1+\delta} - 1 \Bigr\} f(z)\,dz = \exp\Bigl( \{(1 + \delta)^2 - (1 + \delta)\}\,\frac{\theta^2}{2\sigma^2} \Bigr) - 1 = \tfrac{1}{2}\,\delta(1 + \delta)\,\theta^2\,I(f)\,\bigl(1 + O(\theta^2)\bigr)$$

for $\delta$ uniformly bounded from above.

(ii) (Double exponential distribution). Let $f$ denote the density of the
centered double exponential distribution with parameter $\lambda$, that is,
$f(z) = 2^{-1}\lambda\exp(-\lambda|z|)$. Simple calculations provide the expansion

$$\int \Bigl\{ \Bigl(\frac{f(z + \theta)}{f(z)}\Bigr)^{1+\delta} - 1 \Bigr\} f(z)\,dz = \tfrac{1}{2}\,\delta(1 + \delta)\,\theta^2\,\lambda^2\,\bigl(1 + O(\theta)\bigr),$$

for $\delta$ uniformly bounded from above, where $\lambda^2 = I(f)$.
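
The normal case can be checked directly. The short derivation below is a worked sketch of that check; it uses only the density of the $\mathcal{N}(0,\sigma^2)$ law and the moment generating function of a centered normal variable $\varepsilon$.

```latex
\begin{align*}
\frac{f(z+\theta)}{f(z)}
  &= \exp\Bigl(-\frac{2z\theta + \theta^2}{2\sigma^2}\Bigr), \\
\int \Bigl(\frac{f(z+\theta)}{f(z)}\Bigr)^{1+\delta} f(z)\,dz
  &= \exp\Bigl(-\frac{(1+\delta)\theta^2}{2\sigma^2}\Bigr)\,
     \mathbb{E}\exp\Bigl(-\frac{(1+\delta)\theta}{\sigma^2}\,\varepsilon\Bigr) \\
  &= \exp\Bigl(-\frac{(1+\delta)\theta^2}{2\sigma^2}\Bigr)\,
     \exp\Bigl(\frac{(1+\delta)^2\theta^2}{2\sigma^2}\Bigr)
   = \exp\Bigl(\frac{\delta(1+\delta)\theta^2}{2\sigma^2}\Bigr), \\
\exp\Bigl(\frac{\delta(1+\delta)\theta^2}{2\sigma^2}\Bigr) - 1
  &= \tfrac12\,\delta(1+\delta)\,\theta^2\,\sigma^{-2}\,\bigl(1 + O(\theta^2)\bigr).
\end{align*}
```

Since $I(f) = \sigma^{-2}$ here, the last line has the form required in (E2) as long as $\delta$ stays bounded.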

Via Taylor expansion of $(1 + x)^{1+\delta}$ up to the second order and the
theorem of dominated convergence, assumption (E2) can be verified for several
classical error laws, in particular for the logistic distribution which is of
exceptional interest in the theory of rank tests. For any $J \subset [0,1]$, let
$\|\cdot\|_J$ denote the sup-norm restricted on $J$, that is,
$\|l\|_J := \sup_{x \in J} |l(x)|$.

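Theorem 2. Let $\rho_n := ((\log n)/n)^{\beta/(2\beta+1)}$ and define

$$d_* := \Bigl( \frac{2L^{1/\beta}}{(2\beta + 1)\,I(f)\,\|\gamma_\beta\|_2^2} \Bigr)^{\beta/(2\beta+1)}.$$

Let the assumptions (D), (E1) and (E2) be satisfied. Then for arbitrary
numbers $\varepsilon_n > 0$ with $\lim_{n\to\infty} \varepsilon_n = 0$ and
$\lim_{n\to\infty} (\log n)^{1/2}\varepsilon_n = \infty$ we obtain

$$\limsup_{n\to\infty}\ \inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_J \ge (1 - \varepsilon_n) d_* \rho_n} E_l \phi_n(Y) \le \alpha$$

for any fixed nondegenerate interval $J \subset [0,1]$ and arbitrary tests $\phi_n$
at significance level $\le \alpha$.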

Even in the knowledge of both smoothness parameters $(\beta, L)$ and the
explicit error distribution, which is unrealistic for many practical purposes,
for any test $\phi_n$ of $\{0\}$ at significance level $\alpha$, there exists an
alternative $l$ with $\|l\sqrt{h}\|_J \ge (1 - \varepsilon_n) d_* \rho_n$ which will not
be detected with probability $1 - \alpha - o(1)$ or larger. As expected, the smaller
the design density in some location, the more difficult it is to detect there a
deviation from zero.







The next theorem is about the asymptotic power of the multiscale signed
rank test, based on the kernel being the solution to the minimization
problem (2). We restrict our attention to Hölder alternatives with smoothness
parameter $\beta \le 1$. Here the resulting kernel $\psi_\beta$ is pointwise given by
$\psi_\beta(x) = 1 - |2x - 1|^{\beta}$. For $\beta > 1$, an explicit solution of (2) is
known for $\beta = 2$ only; see above. For the sake of simplicity, we consider
compact subintervals of $(0,1)$, which can be avoided by the use of suitable
boundary kernels similar to those in Lepski and Tsybakov (2000).
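
For $0 < \beta \le 1$ the kernel and the squared $L_2$-norm of $\gamma_\beta(x) = I\{|x| \le 1\}(1 - |x|^{\beta})$, which enters the constants of Theorems 2 and 3, are easy to evaluate. The sketch below is illustrative only: the function names are hypothetical, and the closed form $4\beta^2/((\beta+1)(2\beta+1))$ is obtained here by direct integration of $(1 - |x|^{\beta})^2$ over $[-1,1]$ rather than quoted from the paper; the midpoint rule serves as a numerical cross-check.

```python
def psi_beta(x, beta):
    """Kernel psi_beta(x) = 1 - |2x - 1|**beta on [0, 1], zero outside."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return 1.0 - abs(2.0 * x - 1.0) ** beta


def gamma_norm_sq(beta):
    """Closed form of the integral of (1 - |x|**beta)**2 over [-1, 1]."""
    return 4.0 * beta ** 2 / ((beta + 1.0) * (2.0 * beta + 1.0))


def gamma_norm_sq_numeric(beta, m=100_000):
    """Midpoint-rule approximation of the same integral, as a cross-check."""
    h = 2.0 / m
    total = 0.0
    for i in range(m):
        x = -1.0 + (i + 0.5) * h
        total += (1.0 - abs(x) ** beta) ** 2
    return total * h
```

For $\beta = 1$ the closed form gives 2/3, the squared norm of the triangular function $1 - |x|$ on $[-1,1]$.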



Theorem 3. Let $\beta \in (0,1]$. Let $\phi^*$ denote the multiscale signed rank
test based on the kernel $\psi_\beta$. Assume that the first derivative of the error
density exists and is uniformly bounded and integrable. Denote furthermore
$\rho_n := ((\log n)/n)^{\beta/(2\beta+1)}$ and

$$d^* := \Bigl( \frac{2L^{1/\beta}}{(2\beta + 1)\,12\,\bigl(\int f(y)^2\,dy\bigr)^2\,\|\gamma_\beta\|_2^2} \Bigr)^{\beta/(2\beta+1)}.$$

Let assumption (D) be satisfied and suppose that the modulus of continuity
of the design density $h$ is decreasing with at least logarithmic rate, that is,
$\sup_{|x - y| \le \delta} |h(x) - h(y)| = O(1/\log(1/\delta))$ as $\delta \to 0$. Then for
arbitrary numbers $\varepsilon_n > 0$ with $\lim_{n\to\infty} \varepsilon_n = 0$ and
$\lim_{n\to\infty} (\log n)^{1/2}\varepsilon_n = \infty$ we obtain

$$\liminf_{n\to\infty}\ \inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_J \ge (1 + \varepsilon_n) d^* \rho_n} P_l(\phi^* = 1) = 1$$

for any fixed compact interval $J \subset (0,1)$.



The theorem says that if the underlying regression line $l$ multiplied by the
square root of the design density deviates from $\{0\}$ by at least
$(1 + \varepsilon_n) d^* \rho_n$, then the test rejects the null hypothesis with
probability close to 1. Note that the testing procedure does not require
knowledge of the design density $h$. Via the choice of the optimal kernel
function, the test depends on the smoothness parameter $\beta$, but in contrast to
the tests proposed by Lepski and Tsybakov (2000) it remains independent of $L$.



Relative asymptotic efficiency. The ratio $(d_*/d^*)^{(2\beta+1)/\beta}$ may be
interpreted as lower bound for the relative asymptotic efficiency in the
following sense: Let $(\phi_n)$ be a sequence of arbitrary level-$\alpha$ tests for
the simple hypothesis $l = 0$. Let $\delta_n > 0$ such that

$$\liminf_{n\to\infty}\ \inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_J \ge \delta_n} E_l \phi_n = \alpha' > \alpha.$$

Let $m(n)$ be (smallest possible) sample sizes such that

$$\inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_J \ge \delta_n} E_l \phi^*_{m(n)} \ge \alpha'.$$

Then under the conditions of Theorems 2 and 3,

$$\liminf_{n\to\infty} \frac{n}{m(n)} \ge (d_*/d^*)^{(2\beta+1)/\beta} = 12 \Bigl( \int f(y)^2\,dy \Bigr)^2 \Big/ I(f).$$

In case of a Gaussian error density $f = \phi_{0,\sigma^2}$, the former bound equals

$$12 \Bigl( \frac{1}{2\sigma\sqrt{\pi}} \Bigr)^2 \sigma^2 = \frac{3}{\pi},$$

which is well known from the classical theory for the Wilcoxon test under
the assumption of constant alternatives. The existence of optimal tests for
arbitrary error densities $f$ is yet an open problem. In case of homoscedastic
Gaussian errors, minimax optimal tests are provided by Dümbgen and
Spokoiny (2001). Thus one single test has relative asymptotic efficiency close
to 1 with respect to an asymptotically minimax optimal test under sup-norm
loss for arbitrary Hölder alternatives $\mathcal{H}_h(\beta, L)$; $L > 0$. Sharp
asymptotic adaptivity is attained in addition over any range of Hölder classes
$\mathcal{H}_h(\beta, L)$; $L_1 \le L \le L_2$, for some arbitrary constants
$0 < L_1 < L_2 < \infty$. This follows from the fact that the approximations in the
proof hold uniformly in $L$ as long as $L$ stays uniformly bounded away from 0
and $\infty$.
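
The efficiency bound $12(\int f(y)^2\,dy)^2 / I(f)$ is simple to evaluate numerically for a given error density. The following sketch does this with a plain midpoint rule; the function names, the integration range $[-30, 30]$ and the grid size are assumptions chosen for illustration, and the density together with its derivative must be supplied by the user.

```python
import math


def integrate(g, lo, hi, m=200_000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / m
    return h * sum(g(lo + (i + 0.5) * h) for i in range(m))


def efficiency_bound(f, df, lo=-30.0, hi=30.0):
    """12 * (integral of f^2)^2 / I(f), with I(f) the integral of f'^2 / f."""
    int_f_sq = integrate(lambda z: f(z) ** 2, lo, hi)
    fisher = integrate(lambda z: df(z) ** 2 / f(z), lo, hi)
    return 12.0 * int_f_sq ** 2 / fisher


def gauss(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gauss_prime(z):
    return -z * gauss(z)


def logistic(z):
    e = math.exp(-abs(z))          # symmetric form, numerically stable
    return e / (1.0 + e) ** 2


def logistic_prime(z):
    e = math.exp(-abs(z))
    s = (z > 0) - (z < 0)
    return -s * logistic(z) * (1.0 - e) / (1.0 + e)


eff_gauss = efficiency_bound(gauss, gauss_prime)
eff_logistic = efficiency_bound(logistic, logistic_prime)
```

For the standard normal density the value agrees with $3/\pi \approx 0.955$. For the logistic density, $\int f^2 = 1/6$ and $I(f) = 1/3$, so the bound equals 1, in line with the special role of the logistic law in the theory of rank tests.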

Sharp asymptotic adaptivity with respect to both parameters, $\beta$ and $L$,
is still an open problem. Nevertheless, under the conditions of Theorems 2
and 3 we obtain the following

Theorem 4 (Rate-optimality). Let $\phi_n$ be the conditional multiscale signed
rank test at level $\alpha \in (0,1)$, based on some positive continuous kernel
$\psi$ of bounded total variation with $\int \psi(x)\,dx = 1$. Then for arbitrary
$\beta > 0$, $L > 0$, there exist constants $c(\beta, L, \psi) \ge d_*(\beta, L)$ such that

$$\liminf_{n\to\infty}\ \inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_{[0,1]} \ge c(\beta, L, \psi)\rho_n} P_l(\phi_n = 1) = 1.$$

Adaptivity. Without the knowledge of the first parameter $\beta$, the test
achieves the optimal rate nevertheless. Note that $\phi_n$ depends neither on
$\beta$ nor on $L$. The same considerations concerning the proof as indicated
above show that if the range of $(\beta, L)$ is restricted to some compact subset
$[\beta_1, \beta_2] \times [L_1, L_2] \subset (0, \infty)^2$, $\phi_n$ is rate-adaptive in the
usual setting, that is,

$$\liminf_{n\to\infty}\ \inf_{(\beta, L) \in [\beta_1, \beta_2] \times [L_1, L_2]}\ \inf_{l \in \mathcal{H}_h(\beta, L):\ \|l\sqrt{h}\|_{[0,1]} \ge c(\beta, L, \psi)\rho_n} P_l(\phi_n = 1) = 1.$$

Fig. 1.



Remark [Nontrivial power along a sequence of local alternatives
$(l/\sqrt{n})_{n \in \mathbb{N}}$]. In the literature, the power of a goodness-of-fit
test is often investigated along a sequence of alternatives
$(l/\sqrt{n})_{n \in \mathbb{N}}$. Against such local (but directed) alternatives, the
proposed test has nontrivial power as well: If $l$ is continuous with
$\|l\|_{\sup} > 0$, then there exists some compact subinterval $J$ of $[0,1]$ with
$|l(x)| \ge \tau > 0$ for all $x \in J$ and some constant $\tau > 0$. The single
test statistic $|T_{jk}| - (2\log(n/(k - j)))^{1/2}$ with maximal distance
$|X_j - X_k|$ under the constraint $[X_j, X_k] \subset J$ detects a deviation from
$\{0\}$ with asymptotic probability arbitrarily close to 1 for sufficiently large
$\tau$. Thus, the test is consistent against local alternatives
$(a_n l)_{n \in \mathbb{N}}$ whenever $a_n \sqrt{n} \to \infty$.

4. Numerical examples. We illustrate the method with a sample of size
$n = 100$ and independent errors drawn from the Student law with three degrees
of freedom. The design points are equidistant, $X_i = i/n$, and the test
statistic is based on the Epanechnikov kernel. Figure 1 shows the regression
line with the observations. The estimated quantiles of the conditional test
statistic $T_n$ given the vector of ranks of the absolute observation values are
based on 999 Monte Carlo simulations. Here we obtained
$\kappa_{0.1}(R) = 1.4171$. Figure 2(a) presents the minimal intervals of
$\mathcal{D}_{0.1}$, visualized as horizontal line segments and ordered along the
y-axis in a space-saving manner. Figure 2(b) presents the minimal intervals of
rejection at the 0.1-level for an application of the multiscale test [Dümbgen
and Spokoiny (2001)], which is based on the idea of homoscedastic Gaussian
errors [the standardization by $\sqrt{3} = \mathrm{Var}(\mathrm{Student}_3)^{1/2}$
included]. Based on 999 Monte Carlo simulations as well, we found
$\kappa_{0.1} = 1.8187$. The procedure detects a wrong region $[0.56, 0.6]$.
5. Extensions.

5.1. Parametric hypotheses. Suppose that the null hypothesis is
$l \in \{ l_\theta \mid \theta \in \Theta \}$ for some parameter space
$\Theta \subset \mathbb{R}^d$. If $\hat\theta_n$ denotes a $\sqrt{n}$-consistent estimator
of the unknown parameter, the above described procedure is supposed to
be applied to the vector of residuals, $(Y_i - l_{\hat\theta_n}(X_i))_{i=1}^{n}$. In case
of equidistant design points and the rectangular kernel, we conjecture that
under sufficient regularity conditions on $\hat\theta_n$ and the parametric model,
the limit under the null hypothesis of Theorem 1 has the form

$$T_0 := \sup_{0 \le s < t \le 1} \Bigl\{ \frac{|W(t) - W(s) + (g(t) - g(s))'Z|}{\sqrt{t - s}} - \sqrt{2\log(1/(t - s))} \Bigr\}$$

with $W$ a Brownian motion on the unit interval, some continuous
$\mathbb{R}^d$-valued function $g$ and $Z$ a $d$-variate standard normally distributed
random vector. $Z$ comes in via linear expansion of $\hat\theta_n$. The additional
estimation of the parameter does not influence the additive correction. However,
it destroys the finite sample validity of the conditional test, and a bootstrap
procedure may be applied as an approximation.

5.2. Sobolev alternatives. For $\beta \in \mathbb{N}$ and $1 \le p \le \infty$ with
$\beta p > 1$, let

$$\mathcal{F}(\beta, L; p) := \{\, l \mid l^{(\beta-1)} \text{ is absolutely continuous and } \|l^{(\beta)}\|_p \le L \,\},$$

where $\|\cdot\|_p$ denotes the $L_p$-norm. Replacing in the definition of $\rho_n$,
$h_n$ and $d_*$ the constant $\beta$ by $\gamma := \beta - 1/p$ and using that
$L h_n^{\gamma} l(\cdot/h_n) \in \mathcal{F}(\beta, L; p)$ if $l \in \mathcal{F}(\beta, 1; p)$,
the results of Theorem 2 extend to Sobolev classes of alternatives as long as
the solution of (2) [with a Sobolev ball instead of $\mathcal{H}(\beta, 1)$] has
compact support and is of finite total variation. Theorem 3 can be modified in
the same way if in addition the corresponding solution of (2) is nonnegative;
the final argument in step 3 (proof of Theorem 3) may be replaced with a
consideration as in the proof of Theorem 4. The nonnegativity constraint,
however, reduces the range of possible Sobolev classes essentially to
$\beta = 1$. An explicit solution in case $\beta = 1$ and $p \ge 2$ has been derived by
Sz. Nagy (1941), which satisfies the above requirements in particular.

Fig. 2.

5.3. Random design. We conjecture that the design assumption (D) can 
be extended to 

(D') There exists some constant $c > 0$ such that



whenever $0 \le a_n < b_n \le 1$ and $\liminf_{n\to\infty} \log(b_n - a_n)/\log n > -1$.

Here, $H_n$ denotes the empirical distribution function of the design points.
Note that (D) implies (D'). The latter condition is satisfied in particular
with probability 1 if $X_1, \dots, X_n$ are the order statistics of $n$ i.i.d. random
variables with a density which is bounded away from zero.

5.4. Multivariate design. A further perspective is the extension of the
test to two- or even multidimensional design. One application is to detect
simultaneously objects on a surface of different shape and size. However,
there is no natural class of subsets like intervals one has to look at.
Additionally, computational aspects play an increased role: In the univariate
case the supremum is taken over $O(n^2)$ single statistics. In two dimensions
already, the choice of all rectangles leads to $O(n^4)$.

5.5. Error laws with point mass and nonsymmetric errors. If the errors 
are not restricted to be Lebesgue-continuously distributed, define the local 
ranks 



The resulting conditional test keeps the significance level. 

When the assumption of symmetry is violated, the test is not valid anymore.
However, if it seems reasonable in some practical situation that at
least $\mathrm{Med}(\varepsilon_i) = 0$, $i = 1, \dots, n$, one may analyze the data
with multiscale sign tests as used in Dümbgen and Johns (2004) for the
construction of confidence bands for isotonic median curves. Such a multiscale
sign test will be working in a more general setting, but presumably with a
considerable loss of efficiency in the Gaussian case.
6. Proofs.

Proof of Theorem 1. Let us first introduce some notation. Let
$\mathcal{T}_n := \{ (j,k) \mid 1 \le j < k \le n \}$ and define the process $X_n$ on
$\mathcal{T}_n$ pointwise by

$$X_n(j,k) := \frac{1}{\sqrt{n}} \sum_{i=j}^{k} \psi_{jk}(X_i)\,\mathrm{sign}(Y_i)\,\frac{R_{jk}(i)}{k - j + 2}.$$



Since the error distribution is assumed to be symmetric, $\mathrm{sign}(\varepsilon_i)$
is stochastically independent of $|\varepsilon_i|$. Consequently under the null
hypothesis, the vector of signs $(\mathrm{sign}(Y_i))_{i=1}^{n}$ is stochastically
independent of the rank vector $R = R_{1n}$. Moreover, the
$\mathrm{sign}(\varepsilon_i)$ are i.i.d. Rademacher variables. For notational
convenience we write $\xi_i$ for $\mathrm{sign}(\varepsilon_i)$.

The proof is partitioned as follows. In step 1, the conditions of Theorem
6.1 in Dümbgen and Spokoiny (2001) are verified for the conditional process
$X_n$ given the vector of ranks $R$. Second (step 2), the weak approximation of
the conditional process by a Gaussian process in probability is established.

Step 1. For any $(j,k) \in \mathcal{T}_n$, let $\sigma^2_{n,R}(j,k)$ denote the
conditional variance $\mathrm{Var}(X_n(j,k) \mid R)$. The sub-Gaussian tails of the
conditional process $X_n \mid R$ are an immediate consequence of Hoeffding's
inequality:

$$P\bigl( |X_n(j,k)| \ge \sigma_{n,R}(j,k)\,\eta \,\big|\, R \bigr) = P\Bigl( \Bigl| \sum_{i=j}^{k} \psi_{jk}(X_i)\,\xi_i\,\frac{R_{jk}(i)}{k - j + 2} \Bigr| \ge \Bigl( \sum_{i=j}^{k} \psi_{jk}(X_i)^2\,\frac{R_{jk}(i)^2}{(k - j + 2)^2} \Bigr)^{1/2} \eta \,\Big|\, R \Bigr) \le 2\exp(-\eta^2/2)$$

for any $\eta > 0$, uniformly over $R$ and $1 \le j < k \le n$. Let $\rho_n$ be defined by

$$\rho_n((j,k),(j',k'))^2 := |j - j'|/n + |k - k'|/n.$$

In order to show the sub-Gaussian increments of $X_n \mid R$ with respect to
$\rho_n$, it turns out to be sufficient to consider pairs with $j = j' = 1$ and
$k < k' = n$, by the same arguments as used in Dümbgen (2002). For any
$\eta > 0$, an application of Hoeffding's inequality yields



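$$P\Bigl( \Bigl| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_{1n}(X_i)\,\xi_i\,\frac{R_{1n}(i)}{n + 1} - \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \psi_{1k}(X_i)\,\xi_i\,\frac{R_{1k}(i)}{k + 1} \Bigr| \ge \sqrt{1 - k/n}\;\eta \,\Big|\, R \Bigr) \le 2\exp\bigl( -(1 - k/n)\eta^2/(2B) \bigr)$$

with

$$B = \mathrm{Var}\Bigl( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_{1n}(X_i)\,\xi_i\,\frac{R_{1n}(i)}{n + 1} - \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \psi_{1k}(X_i)\,\xi_i\,\frac{R_{1k}(i)}{k + 1} \,\Big|\, R \Bigr).$$

First note that $B \le 2B_1 + 2B_2$, where

(3) $$B_1 = \mathrm{Var}\Bigl( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_{1n}(X_i)\,\xi_i\,\frac{R_{1n}(i)}{n + 1} - \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \psi_{1k}(X_i)\,\xi_i\,\frac{R_{1n}(i)}{n + 1} \,\Big|\, R \Bigr)$$

and

(4) $$B_2 = \mathrm{Var}\Bigl( \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \psi_{1k}(X_i)\,\xi_i\,\frac{R_{1n}(i)}{n + 1} - \frac{1}{\sqrt{n}} \sum_{i=1}^{k} \psi_{1k}(X_i)\,\xi_i\,\frac{R_{1k}(i)}{k + 1} \,\Big|\, R \Bigr).$$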

Hence it is sufficient to show that $B_i \le K(1 - k/n)$ for $i = 1, 2$ with some
constant $K > 0$ independent of $R$, $k$ and $n$. Throughout this proof, $K$
denotes a generic positive constant depending only on $\psi$ and the design
density $h$. Its value may be different in different expressions. Now

(5) $$B_1 \le \frac{1}{n} \sum_{i=1}^{k} \bigl( \psi_{1n}(X_i) - \psi_{1k}(X_i) \bigr)^2 + K(1 - k/n).$$

For notational convenience, we denote the scale $(X_k - X_1)$ by $t_{1k}$. The
finite total variation of $\psi$ implies that $\psi(x) = \int_{[0,x]} g(u)\,dP(u)$ for
all but at most countably many numbers $x \in [0,1]$, where $P$ is some
probability measure on $[0,1]$ and $g$ is some measurable function with
$|g| \le TV(\psi)$. For $0 \le z_1 < z_2 \le 1$ let $\nu$ be defined by
$\nu([z_1, z_2]) := \int_{[z_1, z_2]} |g(x)|\,P(dx)$. Note that
$|\psi(z_1) - \psi(z_2)| \le \nu([z_1, z_2])$. Let $H_n$ denote the empirical
distribution function of the design points and define

$$A_x^{(kn)} := \Bigl[ \frac{x - X_1}{t_{1n}},\ \frac{x - X_1}{t_{1k}} \Bigr].$$

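The sum in (5) is then bounded by

(6) $$\frac{1}{n} \sum_{i=1}^{k} \bigl( \psi((X_i - X_1)/t_{1n}) - \psi((X_i - X_1)/t_{1k}) \bigr)^2 \le \int_{X_1}^{X_k} \nu\bigl(A_x^{(kn)}\bigr)^2\,H_n(dx)$$

(7) $$= \int\!\!\int\!\!\int_{X_1}^{X_k} I\{ y \in A_x^{(kn)},\ z \in A_x^{(kn)} \}\,H_n(dx)\,\nu(dy)\,\nu(dz)$$

$$\le K \sup_{y \in [0,1]} \int_{X_1}^{X_k} I\{ y \in A_x^{(kn)} \}\,H_n(dx)$$

(8) $$\le K \sup_{y \in [0,1]} \bigl( H_n(y t_{1n} + X_1) - H_n(y t_{1k} + X_1) \bigr),$$

where equality (7) follows by an application of Fubini's theorem. But the
design assumption (D) implies that $H - 1/n \le H_n \le H$ pointwise. Therefore,
the latter supremum in (8) is bounded by

$$\sup_{y \in [0,1]} \bigl( H(y t_{1n} + X_1) - H(y t_{1k} + X_1) \bigr) + 1/n \le K \int_{X_k}^{X_n} h(x)\,\lambda(dx) + 1/n,$$

which is bounded from above by $K(1 - k/n)$ for some constant $K$ independent
of $n$ and $k$. In order to bound $B_2$ in (4), define
$\tilde R_{1k}(i) := \sum_{l=k+1}^{n} I\{ |Y_l| \le |Y_i| \}$; thus $R_{1n}(i)$ equals
$R_{1k}(i) + \tilde R_{1k}(i)$ a.s. Then

$$B_2 \le \frac{K}{n} \sum_{i=1}^{k} \Bigl( \frac{R_{1k}(i)}{k + 1} - \frac{R_{1n}(i)}{n + 1} \Bigr)^2 \le K(1 - k/n)^2 + K\,\frac{1}{n} \sum_{i=1}^{k} \frac{\tilde R_{1k}(i)^2}{(n + 1)^2} \le K(1 - k/n).$$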



Consequently, $X_n \mid R$ has sub-Gaussian increments with respect to $\rho_n$.

For some totally bounded pseudometric space $(\mathcal{T}, \rho)$,
$\mathcal{T}' \subset \mathcal{T}$ and any $\varepsilon > 0$, the covering number
$N(\varepsilon, \mathcal{T}', \rho)$ is defined as the infimum of $\#\mathcal{T}_0$ over
all $\mathcal{T}_0 \subset \mathcal{T}$ such that
$\inf_{t_0 \in \mathcal{T}_0} \rho(t_0, t) \le \varepsilon$ for all $t \in \mathcal{T}'$. To
finish step 1, we need to establish the bound for the covering numbers,

$$N\bigl( (\delta u)^{1/2}, \{ (j,k) \in \mathcal{T}_n : \sigma_{n,R}(j,k)^2 \le \delta \}, \rho_n \bigr) \le A u^{-2} \delta^{-1}$$

with a constant $A > 0$, independent of $R$ and $n$. Since $\psi$ is continuous with
$\int_0^1 \psi(x)\,dx > 0$, there exists some nondegenerate interval
$[a, b] \subset [0,1]$ with $\psi(x)^2 \ge \tau$ for some strictly positive constant
$\tau$ and any $x \in [a, b]$. Let
$B_{jk} := \{ i : (X_i - X_j)/t_{jk} \in [a, b] \}$. By assumption (D),

$$\frac{\# B_{jk}}{n} = \int_{t_{jk} a + X_j}^{t_{jk} b + X_j} dH_n(x) \ge H(t_{jk} b + X_j) - H(t_{jk} a + X_j) - \frac{1}{n} \ge K\,\frac{k - j - 1/K}{n}.$$

This entails the lower bound

$$\sigma_{n,R}(j,k)^2 \ge \frac{\tau}{n} \sum_{i=1}^{\# B_{jk}} \Bigl( \frac{i}{k - j + 2} \Bigr)^2 = \frac{\tau}{n}\,\frac{(\# B_{jk})(\# B_{jk} + 1)(2\,\# B_{jk} + 1)}{6(k - j + 2)^2} \ge K\,\frac{k - j - 1/K}{n}$$

with some constant $K > 0$, independent of $R$, $k$, $j$ and $n$. Therefore,

$$N\bigl( (\delta u)^{1/2}, \{ (j,k) \in \mathcal{T}_n : \sigma_{n,R}(j,k)^2 \le \delta \}, \rho_n \bigr) \le N\bigl( (\delta u)^{1/2}, \{ (j,k) \in \mathcal{T}_n : (k - j)/n \le (\delta + 1/n)/K \}, \rho_n \bigr).$$

If $\delta \ge 1/n$, then $\delta + 1/n \le 2\delta$, and via the embedding
$k \mapsto k/n$ of $\mathcal{T}_n$ into $[0,1]$, the covering number can be bounded by
$A u^{-2} \delta^{-1}$ for some constant $A > 0$ with the same argument as given in
Dümbgen and Spokoiny (2001). Note that the desired bound is necessarily satisfied
for $\delta < 1/n$: Then
$\#\{ (j,k) \in \mathcal{T}_n : (k - j)/n \le (\delta + 1/n)/K \} \le \#\{ (j,k) \in \mathcal{T}_n : (k - j) \le 2/K \} \le 2K^{-1} n \le 2K^{-1} \delta^{-1}$.

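Step 2. Let $\mathcal{S}_n:=\{(X_j,X_k)\,|\,0\le j<k\le n\}$, where $X_0:=0$. Redefine the process $X_n$ on $\mathcal{S}_n$ via

$$X_n(s,t):=\frac{1}{\sqrt n}\sum_{i\in J_{st}}\psi_{st}(X_i)\,\mathrm{sign}(Y_i)\,\frac{R_{st}(i)}{\sharp J_{st}+1},\qquad(s,t)\in\mathcal{S}_n,$$

where $J_{st}:=\{i:X_i\in[s,t]\}$ and $R_{st}(i)$ denotes the rank of $|Y_i|$ among the $\sharp J_{st}$ numbers $|Y_k|$, $X_k\in[s,t]$. Furthermore, let the process $Z$ on $\mathcal{S}:=\{(s,t)\,|\,0\le s<t\le1\}$ pointwise be defined by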

$$Z(s,t):=\frac{1}{\sqrt3}\int\psi_{st}(x)\sqrt{h(x)}\,dW(x),\qquad(s,t)\in\mathcal{S},$$

with some Brownian motion $W$ on the unit interval. In the sequel we prove the weak convergence in probability of the conditional process under the null hypothesis, that is,

$$d_w\bigl(\mathcal{L}(X_n|R),\mathcal{L}(Z(s,t))_{(s,t)\in\mathcal{S}_n}\bigr)\to_p0,$$

where $d_w$ denotes some metric generating the topology of weak convergence. It follows by a standard chaining argument and the above established results that uniformly over $R$ and $n$, $X_n|R$ is stochastically equicontinuous with respect to $\rho$, pointwise defined by

$$\rho((s,t),(s',t'))^2:=|H(s)-H(s')|+|H(t)-H(t')|.$$

To prove the weak convergence in probability, it is therefore sufficient to show the convergence of the finite-dimensional distributions of $X_n|R$. Let

$$\phi_{i,n}(s,t):=\psi_{st}(X_i)\,\mathrm{sign}(Y_i)\,\frac{R_{st}(i)}{\sqrt n\,(\sharp J_{st}+1)}\,I\{X_i\in[s,t]\}.$$

Then $X_n(s,t)=\sum_{i=1}^n\phi_{i,n}(s,t)$, and the $\phi_{i,n}$ are independent conditioned on $R$. One verifies that



Kn\\% n 



R < 



\i=l 

and for arbitrary u > 0, 

E(E'{ll<Mi>«}|l<Ml 



2 

sup 



i? 



O(l). 



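For any natural number $k$, let now $\mathcal{S}^k:=\{(s_1,t_1),\ldots,(s_k,t_k)\,|\,0\le s_i<t_i\le1,\ i=1,\ldots,k\}$ and $\mathcal{S}_n^k=\{(s_{1n},t_{1n}),\ldots,(s_{kn},t_{kn})\}\subset\mathcal{S}_n$ such that $(s_{in},t_{in})\to(s_i,t_i)$ for $i=1,\ldots,k$. For a given vector $R$ of ranks, let us introduce the process $Z_{nR}$ on $\mathcal{S}_n$ which is, conditioned on $R$, a centered Gaussian process with conditional covariance structure as $X_n|R$, that is,

$$\mathrm{cov}(Z_{nR}(s,t),Z_{nR}(s',t')|R)=\frac1n\sum_{i\in J_{st}\cap J_{s't'}}\psi_{st}(X_i)\psi_{s't'}(X_i)\,\frac{R_{st}(i)}{\sharp J_{st}+1}\,\frac{R_{s't'}(i)}{\sharp J_{s't'}+1}.\tag{9}$$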

Since the conditional covariance function of $X_n|R$ is uniformly bounded by $\pm\|\psi\|_{\sup}^2$, respectively, Lindeberg's central limit theorem entails that

$$d_w\bigl(\mathcal{L}(X_n|_{\mathcal{S}_n^k}|R),\mathcal{L}(Z_{nR}|_{\mathcal{S}_n^k}|R)\bigr)\to0,$$

due to the compactness of [- 

(10) d w (C(Z nRls u\R),C(Z n]sk )) —►,(). 

Let (s n ,t n ) G 5 n with liminf n \s n — t n \ > 0. Then 



2 

sup ' 



g ]. It remains to be shown that 



< sup 



(F(\Yi\) - F(-\Y t \)) 

3— 53 /{l^i^H}-^!)-^-^)) 



and the latter quantity is $o_p(1)$ by the Glivenko-Cantelli theorem. This shows that for $(s_n,t_n),(s'_n,t'_n)\in\mathcal{S}_n$ with $(s_n,t_n)\to(s,t)\in\mathcal{S}$ and $(s'_n,t'_n)\to(s',t')\in\mathcal{S}$, (9) is equal to

$$\mathrm{cov}(X_n(s_n,t_n),X_n(s'_n,t'_n)|R)=\frac1n\sum_{i=1}^n\psi_{s_nt_n}(X_i)\psi_{s'_nt'_n}(X_i)\bigl(F(|Y_i|)-F(-|Y_i|)\bigr)^2+o_p(1).$$

The random variables $\mathrm{sign}(Y_i)\{F(|Y_i|)-F(-|Y_i|)\}=2F(Y_i)-1$, $i=1,\ldots,n$, are independent and uniformly distributed on $[-1,1]$. Consequently, assumption (D) and an application of Chebyshev's inequality finally yield

$$\mathrm{cov}(X_n(s_n,t_n),X_n(s'_n,t'_n)|R)\to_p\frac13\int\psi_{st}(x)\psi_{s't'}(x)h(x)\,dx,$$

which implies (10).
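
The identity $\mathrm{sign}(Y)\{F(|Y|)-F(-|Y|)\}=2F(Y)-1$ and the resulting second moment $1/3$ (the variance of the uniform law on $[-1,1]$, which is where the factor $1/3$ in the limiting covariance comes from) can be checked by simulation. The sketch below assumes standard normal errors purely as an example of a symmetric continuous distribution; the helper names are hypothetical.

```python
import math
import random

def F(y):
    """Standard normal cdf (assumed example of a symmetric error law)."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def signed_score(y):
    """sign(y) * (F(|y|) - F(-|y|))."""
    s = 1.0 if y > 0 else -1.0
    return s * (F(abs(y)) - F(-abs(y)))

rng = random.Random(0)
ys = [rng.gauss(0.0, 1.0) for _ in range(20000)]
scores = [signed_score(y) for y in ys]

# Largest deviation between the signed score and 2F(Y) - 1 (rounding only).
max_gap = max(abs(sc - (2.0 * F(y) - 1.0)) for sc, y in zip(scores, ys))
# Empirical second moment; the uniform law on [-1, 1] has second moment 1/3.
second_moment = sum(sc * sc for sc in scores) / len(scores)
print(max_gap, second_moment)
```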

From steps 1 and 2 the asserted stochastically weak convergence of our test statistic can be deduced with the same argument as given in Dümbgen (2002), page 528. □



Proof of Theorem 2. For a fixed smoothness class $\mathcal{H}(\beta,L)$, let $\gamma=\gamma_\beta$ be the solution of the optimization problem (2). As pointed out in Section 2, $\gamma$ is an even function with compact support, say $[-C,C]$. Now define the following set of testing functions: For a given bandwidth $h_n>0$ and any integer $j$ let

$$\gamma_{j,n}(\cdot):=\gamma\Bigl(\frac{\cdot-(2j-1)Ch_n}{h_n}\Bigr)\quad\text{and define}\quad g_{j,n}:=\frac{1}{\sqrt h}\,Lh_n^\beta\,\gamma_{j,n}.$$

[Note that $h(\cdot)$ denotes the design density whereas $h_n$ denotes the $n$-dependent scale parameter.] Let $[a,a+b]\subset J$ for some $b>0$ and define

$$\mathcal{J}_n:=\{j\in\mathbb{N}:(2j-1)Ch_n\in[a+Ch_n,a+b-Ch_n]\}.$$

Let $\mathcal{G}_n:=\{g_{j,n}:j\in\mathcal{J}_n\}$. Note that $g\in\mathcal{H}_h(\beta,L)$ for every $g\in\mathcal{G}_n$. Following the arguments in Dümbgen and Spokoiny (2001), proof of Theorem 3.1a, one shows that for any test $\phi:\mathbb{R}^n\to[0,1]$ with significance level $\le\alpha$,



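$$\inf_{g\in\mathcal{G}_n}E_g\phi(X,Y)-\alpha\le E_0\Bigl|\frac{1}{\sharp\mathcal{G}_n}\sum_{g\in\mathcal{G}_n}\frac{dP_g}{dP_0}(X,Y)-1\Bigr|.$$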



The aim is to determine $h_n$ such that the right-hand side tends to zero as $n$ goes to infinity. Define the index set $I_g:=\{i\,|\,g(X_i)>0\}$. By construction, $I_g\cap I_{g'}=\emptyset$ for $g\ne g'$ and $g,g'\in\mathcal{G}_n$. Then for any $g\in\mathcal{G}_n$, the likelihood ratio equals

$$\frac{dP_g}{dP_0}(X,Y)=\prod_{i\in I_g}\frac{f(Y_i-g(X_i))}{f(Y_i)},$$

which shows that $dP_g/dP_0(X,Y)$, $g\in\mathcal{G}_n$, are independent. Note that their expectation is not the same for every $g$. Using a standard truncation argument as in Dümbgen and Walther (2008), proof of Lemma 10, it turns out to be sufficient to find $h_n$ such that

$$\inf_{\delta\in(0,\delta_0]}\max_{g\in\mathcal{G}_n}\frac{1}{(\sharp\mathcal{G}_n)^{\delta}}\prod_{i=1}^n\int\Bigl(\frac{f(y-g(X_i))}{f(y)}\Bigr)^{1+\delta}f(y)\,dy\to0\tag{11}$$

as $n\to\infty$. Using the expansion of assumption (E2), (11) is equal to

$$\inf_{\delta\in(0,\delta_0]}\max_{g\in\mathcal{G}_n}\frac{1}{(\sharp\mathcal{G}_n)^{\delta}}\prod_{i=1}^n\Bigl[1+\frac12\delta(1+\delta)I(f)g(X_i)^2\bigl(1+r(g(X_i),\delta)\bigr)\Bigr].$$

But for $h_n$ sufficiently small, the latter expression is bounded by

$$\inf_{\delta\in(0,\delta_0]}\max_{g\in\mathcal{G}_n}\exp\Bigl(n\tfrac12\delta(1+\delta)I(f)\|g\|_{n,2}^2\bigl(1+\bar r(g)\bigr)-\delta\log(\sharp\mathcal{G}_n)\Bigr),\tag{12}$$

using the series representation of the logarithm, where

$$\|g\|_{n,2}^2:=\frac1n\sum_{i=1}^ng(X_i)^2$$

and $\bar r(g):=\sup_{\delta\in(0,\delta_0]}\sup_{x\in[0,1]}|r(g(x),\delta)|$. Furthermore,
1 n /" 



,. c f- Jxt-i v /i(^) /i(x) ; 



\2 C^,\2 



<L 2 /if ]r SU P 



h(Xi) h{x) 



The last expression is of order $O(h_n^{2\beta}n^{-1})$: Since the design density $h$ is of bounded total variation as well as uniformly bounded away from zero, also $1/h$ is of bounded total variation. In addition, $\gamma$ is bounded and of bounded total variation (for $\beta\le1$, $\gamma$ is explicitly known and unimodal, while its first derivative is Hölder-continuous in case $\beta>1$). Consequently, $TV(\gamma_{j,n}^2/h)\le K(TV(\gamma^2)+TV(h))<\infty$ with some constant $K$ independent of $j$ and $n$, which shows that $\|g_{j,n}\|_{n,2}^2=L^2h_n^{2\beta+1}\|\gamma\|_2^2(1+O((h_nn)^{-1}))$. Thus (12) is bounded by

$$\inf_{\delta\in(0,\delta_0]}\max_{g\in\mathcal{G}_n}\exp\Bigl(n\tfrac12\delta(1+\delta)I(f)L^2h_n^{2\beta+1}\|\gamma\|_2^2\bigl(1+R(n,g)\bigr)-\delta\log(\sharp\mathcal{G}_n)\Bigr),\tag{13}$$

with a sequence $R(n,g)$ of order $O(\max\{(h_nn)^{-1},\bar r(g)\})$.

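Let $\varepsilon_n>0$ be arbitrary numbers with $\varepsilon_n\to0$ and $\varepsilon_n\sqrt{\log n}\to\infty$. Define the bandwidth

$$h_n:=\Bigl(\frac{2\log n}{(2\beta+1)\,nI(f)L^2\|\gamma\|_2^2}\Bigr)^{1/(2\beta+1)}(1-\varepsilon_n)^{1/\beta},$$

which implies that $\sup_{g\in\mathcal{G}_n}R(n,g)$ in (13) is of order $(\log n)^{-1}$. By the choice of $\mathcal{G}_n$, $\sharp\mathcal{G}_n\ge b/(2Ch_n)-1$. Let $\delta=\delta_n:=\varepsilon_n$. Then (13) is bounded by

$$\exp\Bigl(\varepsilon_n(1+\varepsilon_n)(2\beta+1)^{-1}\log n\,(1-\varepsilon_n)^{(2\beta+1)/\beta}-\varepsilon_n(2\beta+1)^{-1}(\log n-\log\log n)+o(1)\Bigr)$$

$$=\exp\Bigl(-\frac{\beta+1}{\beta(2\beta+1)}\,\varepsilon_n^2(1+O(\varepsilon_n))\log n+\varepsilon_n(2\beta+1)^{-1}\log\log n+o(1)\Bigr),$$

which tends to zero as $n$ goes to infinity. □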
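
The exponent in the final bound of this proof can be evaluated numerically to see how slowly it diverges to $-\infty$. The sketch below drops the $o(1)$ term and uses $\varepsilon_n=(\log n)^{-1/4}$, which is only one assumed choice satisfying $\varepsilon_n\to0$ and $\varepsilon_n\sqrt{\log n}\to\infty$; the function name is hypothetical.

```python
import math

def exponent(n, beta, eps):
    """Exponent of the last bound for (13), without the o(1) remainder:
    eps(1+eps)/(2b+1) * log n * (1-eps)^((2b+1)/b)
      - eps/(2b+1) * (log n - log log n)."""
    k = 2.0 * beta + 1.0
    log_n = math.log(n)
    gain = eps * (1.0 + eps) / k * log_n * (1.0 - eps) ** (k / beta)
    penalty = eps / k * (log_n - math.log(log_n))
    return gain - penalty

beta = 1.0
values = []
for n in (1e6, 1e12, 1e24):
    eps_n = math.log(n) ** -0.25  # assumed example sequence
    values.append(exponent(n, beta, eps_n))
print(values)
```

The values are negative and decrease with $n$, but only at a logarithmic pace, which reflects how close the alternatives are to the detection boundary.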



Proof of Theorem 3. By virtue of the proof of Theorem 1, the conditional process $X_n|R$ satisfies the conditions of Theorem 6.1 of Dümbgen and Spokoiny (2001) uniformly in $R$ and $n$. This entails that there exists some constant $C>0$ independent of $n$ with $\kappa_\alpha^n(R)\le C$, where $\kappa_\alpha^n(R)$ denotes the $(1-\alpha)$-quantile of $\mathcal{L}(T_n|R)$ under the null hypothesis. Consequently,

$$P_l(\phi=1)=\int P_l(T_n>\kappa_\alpha^n(R)|R)\,dP_l(R)\ge\int P_l(T_n>C|R)\,dP_l(R)=P_l(T_n>C).$$

Furthermore, $P_l(T_n>C)\ge P_l(|T_{jk}|>C+\sqrt{2\log(n/(k-j))})$ for any $1\le j<k\le n$. It is therefore sufficient to show that for any sequence $l_n\in\mathcal{H}_h(\beta,L)$ with maximal absolute value $\|l_n\sqrt h\|_{\sup}\ge d^*\rho_n(1+\varepsilon_n)$, there exists a sequence of pairs $(j_n,k_n)$ with $1\le j_n<k_n\le n$ such that

$$\liminf_{n}P_{l_n}\bigl(|T_{j_nk_n}|>C+\sqrt{2\log(n/(k_n-j_n))}\bigr)=1.$$

The proof is organized as follows: At first (step 1), the $L_2$-approximation of the numerator of $T_{j_nk_n}$ by a sum of independent random variables is established. Second (step 2), Taylor-type expansions of its expectation and variance are provided, and the asymptotic power of our test is determined along sequences of alternatives converging to zero at the fastest possible rate. Finally (step 3), we treat alternatives converging to zero at a slow rate or staying uniformly bounded away from zero.

Step 1. Let $I_n:=\{j_n,\ldots,k_n\}$ be an interval of indices with $1\le j_n<k_n\le n$ and $\sharp I_n=k_n-j_n+1\to\infty$. For notational convenience, denote $\psi_n:=\psi_{j_nk_n}$ and $R_n(i):=R_{j_nk_n}(i)$, $i\in I_n$. Let $S_n$ be the (normalized) numerator of the single local test statistic $T_{j_nk_n}$, that is,

$$S_n:=\frac{1}{\sqrt{\sharp I_n}}\sum_{i\in I_n}\psi_n(X_i)\,\mathrm{sign}(Y_i)\,\frac{R_n(i)}{\sharp I_n+1}.\tag{14}$$

In the sequel, we establish the approximation of $S_n$ by a sum of independent random variables which is up to $O_p(1/\sharp I_n)$ its Hájek projection [see, e.g., van der Vaart (1998)]. For that purpose the Hoeffding decomposition is applied. With $c_i=c_{n,i}:=(\sharp I_n)^{-1/2}\psi_n(X_i)$, let $A_{ij}:=\mathrm{sign}(Y_i)c_iI\{|Y_j|\le|Y_i|\}$ and define $H_{ij}:=A_{ij}+A_{ji}$. Then



1 „ ^ 1 



s n -■ E E ttJ + ! H a + E ,jj + 

With the definition

$$\tilde H_{ij}:=E(S_n|Y_i,Y_j)-E(S_n|Y_i)-E(S_n|Y_j)+E(S_n)=H_{ij}-E(H_{ij}|Y_i)-E(H_{ij}|Y_j)+EH_{ij}$$

for $i\ne j$, we obtain the decomposition
1 



+ E (jT^T + E 57^TT OW*) -Eflij) 
= ■ 5(0) _j_ 5(1) 

where $S_n^{(0)}$ and $S_n^{(1)}$ are uncorrelated. Note that in particular $E\tilde H_{ij}=0$ and $\mathrm{cov}(\tilde H_{ij},\tilde H_{kl})=0$ for $(i,j)\ne(k,l)$. Consequently

Var(g n -gW)= ] 1 J2 E Vai(fly) 

< - V V 4c 2 . 

y * n ' iei n jei„:j<i 

= 0{l/$I n ), 
since by construction, $\mathrm{Var}(\tilde H_{ij})\le\mathrm{Var}(H_{ij})$. Furthermore, $S_n^{(1)}$ is equal to
— — — sign(K) 

r a/„ + i 



+ E jy^sign^^D-^-M)) 

+ E rTTrlf sign(y)dF l (y)-E(H l3 )\, 
.jri Wn + l U K\r-|y,-|,|y,-|i J 



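where $F_i$ denotes the distribution function of $Y_i$. For any distribution function $F$, let $G$ be pointwise defined on $\mathbb{R}_+$ by $G(t):=F(t)-F(-t-)$, with $F(y-)$ the limit on the left, that is, $\lim_{x\nearrow y}F(x)$. We denote $\bar F:=(1/\sharp I_n)\sum_{i\in I_n}F_i$, $\bar G(t):=\bar F(t)-\bar F(-t-)$ and $\bar F^\psi:=(1/\sharp I_n)\sum_{i\in I_n}\psi_n(X_i)F_i$. Then $E(S_n^{(1)}-\hat S_n)^2=O(1/\sharp I_n)$, with

$$\hat S_n:=\frac{1}{\sqrt{\sharp I_n}}\sum_{i\in I_n}\Bigl[\psi_n(X_i)\,\mathrm{sign}(Y_i)\bar G(|Y_i|)+\int_{\mathbb{R}\setminus[-|Y_i|,|Y_i|]}\mathrm{sign}(y)\,d\bar F^\psi(y)-E\int_{\mathbb{R}\setminus[-|Y_i|,|Y_i|]}\mathrm{sign}(y)\,d\bar F^\psi(y)\Bigr].\tag{15}$$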
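
The quality of the approximation in step 1 can be illustrated by simulation in the simplest situation. The sketch below makes several assumptions that are not part of the proof: it works under the null hypothesis with identically distributed errors uniform on $[-1,1]$ (so that $G(t)=t$ and the two integral terms in (15) vanish by symmetry) and with the constant kernel $\psi_n\equiv1$. It then compares the rank-based numerator with the sum of independent terms $\mathrm{sign}(Y_i)G(|Y_i|)$; function names are hypothetical.

```python
import random

def normalized_statistics(m, rng):
    """One sample of size m: return (S_n, S_hat) with psi = 1."""
    ys = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    order = sorted(range(m), key=lambda i: abs(ys[i]))
    rank = [0] * m
    for r, i in enumerate(order, start=1):
        rank[i] = r  # rank of |Y_i| among |Y_1|, ..., |Y_m|
    sign = [1.0 if y > 0 else -1.0 for y in ys]
    s_n = sum(sign[i] * rank[i] / (m + 1) for i in range(m)) / m ** 0.5
    # For errors uniform on [-1, 1]: G(t) = F(t) - F(-t) = t on [0, 1].
    s_hat = sum(sign[i] * abs(ys[i]) for i in range(m)) / m ** 0.5
    return s_n, s_hat

def mean_squared_gap(m, reps=300, seed=1):
    """Monte Carlo estimate of E(S_n - S_hat)^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        a, b = normalized_statistics(m, rng)
        total += (a - b) ** 2
    return total / reps

print(mean_squared_gap(50), mean_squared_gap(400))
```

In this toy setting the mean squared gap shrinks roughly like $1/\sharp I_n$, in line with the order stated for $E(S_n^{(1)}-\hat S_n)^2$.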



Step 2. For two functions $f$ and $g$ in $L_2[0,1]$, let

$$\langle f,g\rangle_{I_n}:=\frac{1}{\sharp I_n}\sum_{i\in I_n}f(X_i)g(X_i)$$

and let $\|f\|_{I_n,2}:=\langle f,f\rangle_{I_n}^{1/2}$ denote the corresponding norm. Let $(l_n)$ be a sequence of alternatives. If $M(l_n)$ denotes the maximal point of $|l_n|$, let $(X_{j_n},X_{k_n})$ be the design points which are closest to $M(l_n)-h_n$ and $M(l_n)+h_n$, respectively, where $h_n:=(\delta_n/L)^{1/\beta}$ with $\delta_n:=d^*\rho_n(1+\varepsilon_n)$. Symmetry considerations show that we may assume without loss of generality that $l_n$ is positive at $M(l_n)$. Besides the restriction $\|l_n\sqrt h\|_{\sup}\ge d^*\rho_n(1+\varepsilon_n)$, it is assumed in this paragraph that

$$\|l_n\|_{\sup}/\rho_n=O(1),\tag{16}$$

which is equivalent to $\|l_n\sqrt h\|_{\sup}/\rho_n=O(1)$. Note that (16) implies $\sqrt{\sharp I_n}\,\|l_n\|_{I_n,2}^2=o(1)$.

Our first goal is to show that

$$\frac{E_{l_n}\hat S_n}{\sqrt{\mathrm{Var}_{l_n}\hat S_n}}=\sqrt{12}\,\sqrt{\sharp I_n}\,\frac{\langle\psi_n,l_n\rangle_{I_n}}{\|\psi_n\|_{I_n,2}}\int f(y)^2\,dy+o(1)\tag{17}$$

for any sequence $(l_n)$ satisfying (16). The symmetry of the error distribution around zero and the boundedness of the first derivative $f'$ provide the expansion

$$\mathrm{sign}(Y_i)\bar G(|Y_i|)=\mathrm{sign}(Y_i)\Bigl\{\bigl(F(|Y_i|)-F(-|Y_i|)\bigr)-\bigl(f(|Y_i|)-f(-|Y_i|)\bigr)\Bigl(\frac{1}{\sharp I_n}\sum_{j\in I_n}l_n(X_j)\Bigr)+O_{\mathrm{unif}}(\|l_n\|_{I_n,2}^2)\Bigr\}$$

$$=(2F(Y_i)-1)+O_{\mathrm{unif}}(\rho_n\|l_n\|_{I_n,2}).$$

Here and in what follows, a sequence of random variables $(Z_n)$ is $O_{\mathrm{unif}}(c_n)$ with a sequence of positive numbers $(c_n)$, if $\limsup_n|Z_n/c_n|\le c<\infty$ with some nonrandom nonnegative constant $c$. In order to treat the expectation

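$$E_{l_n}\hat S_n=\frac{1}{\sqrt{\sharp I_n}}\sum_{i\in I_n}\psi_n(X_i)\Bigl\{\int(2F(y)-1)\,dF_i(y)+O(\rho_n\|l_n\|_{I_n,2})\Bigr\},$$

first observe that for any $\delta\in\mathbb{R}$, $\int_{\mathbb{R}}(2F(y)-1)f(y+\delta)\,dy=\int_{\mathbb{R}}f'(t)\int_{t-\delta}^{t}(2F(y)-1)\,dy\,dt$, using Fubini's theorem and the symmetry of the error density $f$. Taylor expansion of the inner integral entails that

$$E_{l_n}\hat S_n=\sqrt{\sharp I_n}\,\langle\psi_n,l_n\rangle_{I_n}\Bigl\{-\int(2F(y)-1)f'(y)\,dy\Bigr\}+\sqrt{\sharp I_n}\,O(\rho_n\|l_n\|_{I_n,2})$$

$$=2\sqrt{\sharp I_n}\,\langle\psi_n,l_n\rangle_{I_n}\Bigl\{\int f(y)^2\,dy\Bigr\}+\sqrt{\sharp I_n}\,O(\|l_n\|_{I_n,2}^2),\tag{18}$$

where the last equality is obtained via partial integration. Furthermore,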
Var Zn ( -L= E MXi)s^(X)G(\Yi\)) 

(19) 

= 77- E MXif^m) - 1) 2 + o(iigi? n , 2 ). 
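
The partial-integration step behind (18), namely $-\int(2F(y)-1)f'(y)\,dy=2\int f(y)^2\,dy$, can be verified numerically for a concrete error law. The sketch below assumes the standard normal density as an example, for which both sides equal $1/\sqrt\pi$; the simple trapezoidal rule and all names are illustrative choices.

```python
import math

def pdf(y):
    """Standard normal density (assumed example of a symmetric f)."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def cdf(y):
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def dpdf(y):
    """Derivative f'(y) = -y f(y) of the standard normal density."""
    return -y * pdf(y)

def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

lhs = -trapezoid(lambda y: (2.0 * cdf(y) - 1.0) * dpdf(y), -10.0, 10.0, 20000)
rhs = 2.0 * trapezoid(lambda y: pdf(y) ** 2, -10.0, 10.0, 20000)
print(lhs, rhs, 1.0 / math.sqrt(math.pi))
```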

In order to bound the variance of the second part in the approximation (15), 
namely 

(20) Var in f _L= E / sign(y) dF^(y)) 

\vVnit?J R\[-\Yi\,\Yi\] J 
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(21) < E *n 



Slg 



^(y)dF^(y) 



note that by the symmetry of sign(-) and Fubini's theorem, 
/ siga(y)dF^(y) 

J [-z,z]<= 

-^]TVnPQ) / /'(*)/ -sign(y)I{ye[t,t + Z re (Xi)]}dy^ 

Un f—f JTSL J \-z,z] c 

'tin 

<{i>n,\ln\)l n ! \f'(t)\dt. 

Jr 

This shows that (20) is 0(||in||f 2) by Cauchy-Schwarz. Furthermore, 



(2F(y)-lfd(F i (y)-F(y)) 

= [ (2F(y) - l) 2 f 
Jr Jy-in 



y-l n (Xi) 



-f'(t)dtdy 



fit) 



t+l n (Xi) 



-(2F(y)-l) 2 dydt 



l n (Xi) f Af(t) 2 (2F(t)-l)dt + 0(l n (X, i ) 2 ), 

JR 



where the latter integral is equal to zero by the symmetry of the error distribution. This finally gives together with (19) and the bound of (20)

$$\mathrm{Var}_{l_n}\hat S_n=\tfrac13\|\psi_n\|_{I_n,2}^2+O(\|l_n\|_{I_n,2}^2).\tag{22}$$

Note at this point that $\mathrm{Var}_{l_n}\hat S_n$ is uniformly bounded from above and from below. Thus the combination of (18) and (22) entails (17) for any sequence $(l_n)$ satisfying (16).

In the next step, it will be shown that the denominator of $T_{j_nk_n}$ is a sufficiently good approximation for the standard deviation of $\hat S_n$ under the sequence of alternatives $l_n$. Remember that it is the conditional standard deviation given the vector of ranks of the numerator under the null hypothesis. Using the representation $R_n(i)=\sum_{k\in I_n}I\{|Y_k|\le|Y_i|\}$ a.s., one verifies that

Rn(i) 2 



1 



(Un 



1) 



and analogously for i,j £ I n with i^j 

Rn(rf Rn(j? 



El 



(Un + l) 2 (Un + l? 



^11^111,2 + 0(11^111,2), 



Y i ,YA=G(\Y i \YG(\Y j \y + O nnit (l/Vn) 
and

which by Chebyshev's inequality shows in particular that under condition
(16)

1/2 



'Un- 



»n\\I n ,2 

(23) 
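
The role of the denominator as conditional standard deviation given the ranks can be made concrete on a tiny example. Under the null hypothesis with symmetric continuous errors, the signs are independent of the ranks and uniform on $\{-1,+1\}$, so the conditional law of the numerator is obtained by enumerating all sign vectors. The sketch below assumes the local statistic has the form numerator over $\bigl(\sum_i\psi_n(X_i)^2R_n(i)^2/(\sharp I_n+1)^2\bigr)^{1/2}$ (the common factor $1/\sqrt{\sharp I_n}$ cancels); kernel values and ranks are made up for illustration.

```python
import itertools
import math

def local_statistic(psi, signs, ranks):
    """Signed rank sum divided by its conditional standard deviation
    given the ranks (assumed form of the local test statistic)."""
    m = len(ranks)
    num = sum(p * s * r / (m + 1) for p, s, r in zip(psi, signs, ranks))
    den = math.sqrt(sum((p * r / (m + 1)) ** 2 for p, r in zip(psi, ranks)))
    return num / den

psi = [0.5, 1.0, 1.0, 0.5]   # hypothetical kernel values psi_n(X_i)
ranks = [3, 1, 4, 2]         # hypothetical ranks of |Y_i|

# Conditional null distribution: all 2^4 sign vectors are equally likely.
values = [local_statistic(psi, s, ranks)
          for s in itertools.product([-1, 1], repeat=len(ranks))]
mean = sum(values) / len(values)
var = sum(v * v for v in values) / len(values) - mean ** 2
print(mean, var)
```

The enumeration returns conditional mean 0 and conditional variance 1, which is the exact finite-sample standardization that makes the conditional test distribution-free.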



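Since $\bar G(\cdot)$ is uniformly bounded by 1, the Lindeberg condition is easily verified for $\hat S_n$. Then Lindeberg's central limit theorem yields in combination with the result from step 1, (17) and (23)

$$P_{l_n}\bigl(T_{j_nk_n}>C+\sqrt{2\log(n/\sharp I_n)}\bigr)=1-\Phi\Bigl(C+\sqrt{2\log(n/\sharp I_n)}-\sqrt{12}\,\sqrt{\sharp I_n}\,\frac{\langle\psi_n,l_n\rangle_{I_n}}{\|\psi_n\|_{I_n,2}}\int f(y)^2\,dy\Bigr)+o(1),$$

with $\Phi$ the standard normal distribution function. It remains to be shown that

$$\sqrt{12}\,\sqrt{\sharp I_n}\,\frac{\langle\psi_n,l_n\rangle_{I_n}}{\|\psi_n\|_{I_n,2}}\int f(y)^2\,dy-\sqrt{2\log(n/\sharp I_n)}\to\infty\tag{24}$$

as $n$ goes to infinity under the constraints $\|l_n\sqrt h\|_{\sup}\ge d^*\rho_n(1+\varepsilon_n)$ and (16).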
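
The normal approximation to the local power can be turned into a small calculator. The sketch below reads the leading constant of the drift as $\sqrt{12}$, which is what the ratio of the expectation in (18) to the square root of a variance of size $\frac13\|\psi_n\|_{I_n,2}^2$ gives; this reading, the function names, and the Gaussian value $\int f^2=1/(2\sqrt\pi)$ are assumptions made for illustration, and the $o(1)$ term is ignored.

```python
import math

GAUSS_INT_F2 = 1.0 / (2.0 * math.sqrt(math.pi))  # int f^2 for N(0, 1)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def drift(m, inner, psi_norm, int_f2):
    """sqrt(12 m) * <psi, l> / ||psi|| * int f^2, with m = #I_n."""
    return math.sqrt(12.0 * m) * inner / psi_norm * int_f2

def approx_power(C, n, m, d):
    """1 - Phi(C + sqrt(2 log(n/m)) - d), the approximation without o(1)."""
    return 1.0 - Phi(C + math.sqrt(2.0 * math.log(n / m)) - d)

# Hypothetical numbers: n = 1000 observations, window of m = 50 points.
d = drift(50, 0.8, 1.0, GAUSS_INT_F2)
print(approx_power(2.0, 1000, 50, d))
```

The approximate power exceeds one half exactly when the drift exceeds $C+\sqrt{2\log(n/\sharp I_n)}$, which is why (24) is the quantity that has to diverge.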

Under the assumptions about the kernel $\psi$ and the design density $h$, arguments involving bounded total variation of $\psi$ and $h$ yield the approximation



nVK^n^ I f{y? dy - p\og{n/Un) 

\\Vn\\I n ,2 J V 

^nJnVh) I }{y) 2 dy _ v / 2 l0g(n/(t|J ra ))+ (l). 



.. y n||/n,2 

(25) 



y n 2 



Let $\psi^{(n)}$ be the kernel rescaled to the interval $[M(l_n)-h_n,M(l_n)+h_n]$. Then

~N^ = ||^)|| 2 < 1 + ^ ))' 

using that $X_{j_n}-(M(l_n)-h_n)=O(n^{-1})$ and $X_{k_n}-(M(l_n)+h_n)=O(n^{-1})$ by assumption (D). But $\delta_n\psi^{(n)}$ by its construction as well as $l_n\sqrt h$ are elements of $\mathcal{H}(\beta,L)$. Then as in Dümbgen and Spokoiny (2001), a convexity argument yields the inequality



|2 || ^ v '||2 

One verifies that 

V^V^d /(l/j^^n^llT^^l + OlK)" 1 )) " V / 2bg077O +o(] 

> e n (2/(2/3 + 1)) 1/2 v/loi^+ o(l) oo 
and therefore (24) follows in combination with (25) and (26). 

Step 3. Suppose now that there exists a sequence $(l_n)$ with

$$\liminf_{n}P_{l_n}\bigl(T_{j_nk_n}>C+\sqrt{2\log(n/\sharp I_n)}\bigr)=c<1,$$

where the indices $j_n,k_n$ are chosen as in step 2. This implies the existence of a subsequence [for simplicity also denoted by $(l_n)$] without any subsubsequence having the property (16); that is, we may assume $\|l_n\|_{\sup}/\rho_n\to\infty$. We will conclude the proof via contradiction as follows: For any subsequence of a sequence $(l_n)$ satisfying $\|l_n\|_{\sup}/\rho_n\to\infty$, there exists a subsubsequence which either converges to zero at a slow rate or whose maximal absolute value stays uniformly bounded away from zero. Hence we need to show that in both cases, our test attains asymptotic power 1.

Note that the squared denominator of $T_{j_nk_n}$ is bounded by $\|\psi\|_{\sup}^2$, while $\mathrm{Var}_{l_n}(\hat S_n)$ is uniformly bounded. Using again the approximation of the numerator by $\hat S_n$, we obtain



$$T_{j_nk_n}-\sqrt{2\log(n/\sharp I_n)}\ge\|\psi\|_{\sup}^{-1}E_{l_n}\hat S_n-\sqrt{2\log(n/\sharp I_n)}+O_p(1).\tag{27}$$

If there exists a sequence $(l_n)$ with the property $\|l_n\|_{\sup}/\rho_n\to\infty$ but which converges to zero,

$$E_{l_n}\hat S_n=2\sqrt{\sharp I_n}\,\langle\psi_n,l_n\rangle_{I_n}\Bigl\{\int f(y)^2\,dy\Bigr\}+\sqrt{\sharp I_n}\,O(\|l_n\|_{I_n,2}^2),\tag{28}$$

as seen in step 2. But then the first term dominates in order the second one as well as the logarithmic correction, which shows that the right-hand side in (27) goes to infinity.

Otherwise, assume that $(l_n)$ stays uniformly bounded away from zero. First observe that with $\bar l_n:=(1/\sharp I_n)\sum_{i\in I_n}l_n(X_i)$, $|l_n(X_i)-\bar l_n|\le L|X_{j_n}-X_{k_n}|^\beta=O(h_n^\beta)$. Taylor expansion around $\bar l_n$ up to the first order provides the approximation

K ln S n = -L= ^ ^„(Xi){ / (F(y + Z re pQ)) " ^(-2/ " ln{Xi)))f{y) dy 
VVn ieIn U 

= -±= MXi){J (F(y) - F(-v - 2ln))f(y) dy + o(tg) 

= E In S n + 0(n^h^). 

If $\bar l_n$ is uniformly bounded away from zero, $E_{\bar l_n}\hat S_n$ is of order not smaller than $O(\sqrt{nh_n})$, which dominates in order the approximation error $|E_{\bar l_n}\hat S_n-E_{l_n}\hat S_n|$ as well as the logarithmic correction. □

Proof of Theorem 4. By virtue of the proof of Theorem 3, it remains to be shown that (i) there exists some positive constant $C=C(\beta,L,\psi)$, such that (24) goes to infinity for alternatives $l_n$ with $K\rho_n\ge\|l_n\sqrt h\|_{\sup}\ge C\rho_n$ for any constant $K>C$ and (ii) $E_{l_n}\hat S_n$ goes to infinity whenever $\|l_n\|_{\sup}/\rho_n\to\infty$. To this aim, we establish the following: If $l\in\mathcal{H}(\beta,L)$ with $\|l\|_{\sup}\le1$ and $x^*:=\arg\max_{x\in[0,1]}|l(x)|$, then there exist some constant $c=c(\beta,L)>0$ and a closed interval $I(l)\subset[0,1]$ such that $\lambda(I(l))\ge c|l(x^*)|^{1/\beta}$ and

$$|l(x)|\ge\tfrac12|l(x^*)|\quad\text{for every }x\in I(l).\tag{29}$$

Note that this is obviously correct in case $\beta\le1$ with $c=1/(2L)$. For $\beta>1$, let $\lfloor\beta\rfloor$ denote the largest integer strictly smaller than $\beta$. Let $l\in\mathcal{H}(\beta,L)$ with $\|l\|_{\sup}=D>0$. Taylor expansion around any point $y\in[0,1]$ provides the approximation

$$l(x)=l(y)+(x-y)l'(y)+\cdots+\frac{(x-y)^{\lfloor\beta\rfloor}}{\lfloor\beta\rfloor!}\,l^{(\lfloor\beta\rfloor)}(y)+R_l(x,y)$$

with $|R_l(x,y)|\le L|x-y|^\beta\ (\le L)$. Thus,

$$\Bigl|(x-y)l'(y)+\cdots+\frac{(x-y)^{\lfloor\beta\rfloor}}{\lfloor\beta\rfloor!}\,l^{(\lfloor\beta\rfloor)}(y)\Bigr|\le2D+L.\tag{30}$$



Lemma. There exists a universal constant $K=K_d$ such that for any polynomial $P$ of degree $d>0$, say $P(x)=\sum_{k=0}^da_kx^k$, and $\|P\|_{[0,1]}\le D$, $D>0$, it holds true that $\sup_{k=0,\ldots,d}|a_k|\le K_d\cdot D$.
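
The dependence of the constant in the lemma on the degree can be illustrated with shifted Chebyshev polynomials, which have sup-norm 1 on $[0,1]$ but large coefficients. The sketch below approximates the sup-norm on a grid (an assumption that is exact for these two examples because their extrema lie on the grid); the helper names are hypothetical, and the printed ratios are only lower bounds for the best possible constant.

```python
def sup_norm(coeffs, grid=2000):
    """Grid approximation of max |P(x)| on [0, 1] for
    P(x) = sum_k coeffs[k] * x**k, evaluated with Horner's rule."""
    best = 0.0
    for i in range(grid + 1):
        x = i / grid
        val = 0.0
        for a in reversed(coeffs):
            val = val * x + a
        best = max(best, abs(val))
    return best

def coefficient_ratio(coeffs):
    """max_k |a_k| divided by the sup-norm on [0, 1]."""
    return max(abs(a) for a in coeffs) / sup_norm(coeffs)

cheb2 = [1.0, -8.0, 8.0]            # 8x^2 - 8x + 1 = T_2(2x - 1)
cheb3 = [-1.0, 18.0, -48.0, 32.0]   # 32x^3 - 48x^2 + 18x - 1 = T_3(2x - 1)
print(coefficient_ratio(cheb2), coefficient_ratio(cheb3))
```

The ratios 8 and 48 show that any admissible constant must grow with the degree, while for each fixed degree the equivalence of norms on a finite-dimensional space guarantees that it is finite.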

The lemma results from the fact that, for the polynomial $P(x)=\sum_{k=0}^da_kx^k$, $\|P\|_{(1)}=\|P\|_{[0,1]}$ and $\|P\|_{(2)}=\max_{0\le k\le d}|a_k|$ are two norms in the $(d+1)$-dimensional space of polynomials of degree $d$, and these norms are equivalent. Its application implies together with the bound (30) that there exists a constant $K=K(\beta)$ such that $|l(x)-l(x^*)|\le\|l'\|_{\sup}|x-x^*|\le K(2D+L)|x-x^*|$. Then $|l(x)|\ge\frac12|l(x^*)|$ on $[x^*-D/(4KD+2KL),x^*+D/(4KD+2KL)]\cap[0,1]$. If now $l_n\in\mathcal{H}(\beta,L)$ with $\|l_n\|_{\sup}=\delta_n\le1$, then at least $[x^*-2^{-1}\delta_n^{1/\beta},x^*]$ or $[x^*,x^*+2^{-1}\delta_n^{1/\beta}]$ is fully contained in $[0,1]$. Assume without loss of generality that $[x^*,x^*+2^{-1}\delta_n^{1/\beta}]\subset[0,1]$. Then $g_n$, defined by $g_n(x):=2^\beta\delta_n^{-1}l_n(2^{-1}\delta_n^{1/\beta}x+x^*)$ for $x\in[0,1]$, is an element of $\mathcal{H}(\beta,L)$ with $\|g_n\|_{\sup}=g_n(0)=2^\beta$. Thus the above lemma finally implies that $|l_n(x)|\ge\delta_n/2$ on $[x^*,x^*+1/(8K+4K2^{-\beta}L)\,\delta_n^{1/\beta}]$.

The assumption about $\psi$ implies that there exists some interval $[c,d]\subset(0,1)$ on which $\psi(x)\ge\delta$ for some strictly positive constant $\delta$. We first verify the claim (i). For any alternative $l_n$, let $\psi_n$ be the kernel rescaled onto the interval $[X_{j_n},X_{k_n}]$, where the design points $X_{j_n}<X_{k_n}$ are those which are closest to the endpoints of $I(l_n\sqrt h)$. Let $I_n:=\{i:X_i\in I(l_n\sqrt h)\}$. Then $\langle\psi_n,l_n\sqrt h\rangle_{I_n}$ is of order not smaller than $\|l_n\sqrt h\|_{\sup}$, which implies the existence of a universal constant $C=C(\beta,L,\psi)$ such that (24) goes to infinity for $\|l_n\sqrt h\|_{\sup}\ge C\rho_n$ and $\|l_n\|_{\sup}/\rho_n=O(1)$. The same consideration also shows that (28) goes to infinity whenever $\|l_n\|_{\sup}/\rho_n\to\infty$ and $\|l_n\|_{\sup}\to0$, because $\|l_n\sqrt h\|_{\sup}$ dominates in order $\rho_n\|l_n\|_{I_n,2}$ as well. To verify (ii), note that $\|l_n\sqrt h\|_{\sup}/(4K\|l_n\sqrt h\|_{\sup}+2KL)$ stays uniformly bounded away from zero and infinity as soon as $\|l_n\|_{\sup}$ is uniformly bounded away from zero. Thus in the latter case, there always exists an interval $I(l_n\sqrt h)$ with $\liminf_{n\to\infty}\lambda(I(l_n\sqrt h))>0$ and $|l_n(X_i)\sqrt{h(X_i)}|\ge\|l_n\sqrt h\|_{\sup}/2$ for every $X_i\in I(l_n\sqrt h)$. With $I_n:=\{i\,|\,X_i\in I(l_n\sqrt h)\}$

o 1 VMyU; ^ E^i? n (i)|signCr t )) 

TPn £ MXl) slgn{Yl) KTl 

If $l_n(X_i)$ is uniformly bounded away from zero for every $i\in I_n$, the absolute expectation of the first term is of order $O(\sqrt n)$, while the second term is $O_p(1)$.

□